Automatic Generation of Prosody: Comparing Two Superpositional Systems
Abstract
Designing a system that automatically generates prosody from linguistic and paralinguistic information involves many choices. The literature provides several candidate phonetic models, phonological models and mapping tools with which to implement such a system. We detail here some dimensions along which these models have to be compared. We also show that systems employing quite similar phonetic models can still take radically different approaches. We present results of a first evaluation comparing two systems that use a superpositional model of melody on a common multilingual prosodic database of spoken math formulae. We conclude that prosodic models and intonation theories could certainly benefit from well-defined tasks and fair benchmarks.
Related papers
Generating the Prosody of Attitudes
This paper presents a superpositional model of prosody where linguistic structures are directly encoded into the prosodic parameters via global melodic and rhythmic contours. This model is applied here to the generation of prosody specific to six common attitudes in French. Perceptual tests using high-quality TD-PSOLA re-synthesis show that predicted contours yield the same identification score...
The New Slovenian Text-to-Speech System
Human-computer interaction in a natural language is becoming possible due to rapid development of computer power. While text-to-speech (TTS) systems for major world languages are quite advanced, smaller languages, like our Slovenian language, lack quality TTS synthesis. At the "Jozef Stefan" Institute a system called GOVOREC (SPEAKER) has been developed which is capable of automatic conversion ...
The use of F0 reliability function for prosodic command analysis on F0 contour generation model
This paper describes a method of utilizing an “F0 Reliability Field” (FRF), which we proposed in our previous work, for estimating prosodic commands in an F0 contour generation model. The FRF is a time-frequency representation of F0 likelihood, and one advantage of the FRF is that it is not necessary to consider F0 errors that occur during automatic F0 determination. Therefore, it is thought...
Employing Sentence Structure: Syntax Trees as Prosody Generators
In this paper, we describe a prosody generation system for speech synthesis that makes direct use of syntax trees to obtain duration and pitch. Instead of transforming the tree through special rules or extracting isolated features from the tree, we make use of the tree structure itself to construct a superpositional model that is able to learn the relation between syntax and prosody. We impleme...
On representation of fundamental frequency of speech for prosody analysis using reliability function
This paper highlights a method that provides a new prosodic feature called the ‘reliability field’, based on a reliability function of the fundamental frequency (F0). The proposed method does not employ any correction process for estimation errors that occur during automatic extraction. By applying this feature as a score function for prosodic analyses like prosodic structure estimation or superp...